CVE-2026-53923

0.0

vLLM GGUF Kernels: int64_t to int truncation of tensor dimensions causes GPU buffer overflow

Overview

Description

vLLM is an inference and serving engine for large language models (LLMs). From 0.5.5 until 0.23.1rc0, integer truncation of tensor dimensions in vLLM's GGUF dequantize kernels (csrc/quantization/gguf/gguf_kernel.cu) causes partial tensor processing. The output tensor is allocated at full size via torch::empty (uninitialized memory), but the dequantize CUDA kernel processes only a truncated number of elements. The unfilled portion of the output tensor retains whatever was previously in GPU memory. In multi-tenant inference deployments, this residual GPU memory may contain tensor data from other users' inference requests, constituting information disclosure. This vulnerability is fixed in 0.23.1rc0.
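The arithmetic behind the truncation can be illustrated with a small Python model. This is a sketch, not the vLLM code: it assumes the kernel receives the total element count narrowed to a signed 32-bit integer, and the helper names `truncate_to_int32` and `elements_left_unwritten` are hypothetical. The exact value that is narrowed in `gguf_kernel.cu` may differ.

```python
INT32_MAX = 2**31 - 1


def truncate_to_int32(n: int) -> int:
    """Keep the low 32 bits of n and reinterpret them as a signed 32-bit int.

    This mimics what a typical int64_t -> int narrowing does on mainstream
    compilers (an assumption about the platform, not taken from the advisory).
    """
    low = n & 0xFFFFFFFF
    return low - 2**32 if low > INT32_MAX else low


def elements_left_unwritten(rows: int, cols: int) -> int:
    """Count output elements that keep stale memory contents.

    Assumption: the output buffer is allocated for rows * cols elements,
    but the kernel only processes the truncated element count.
    """
    total = rows * cols               # size of the allocated output tensor
    seen = truncate_to_int32(total)   # count the kernel actually receives
    processed = max(seen, 0)          # a negative count processes nothing
    return total - processed
```

For example, under these assumptions a 65536 x 65537 tensor has 2^32 + 65536 elements, the truncated count is 65536, and 2^32 elements of the output are never written. A 1024 x 1024 tensor fits in 32 bits and is fully processed.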

Details

INFO

Published Date: June 22, 2026, 9:55 p.m.

Last Modified: June 22, 2026, 9:55 p.m.

Remotely Exploitable: No

Source: GitHub_M

Impact

Affected Products

The following products are affected by the CVE-2026-53923 vulnerability. cvefeed.io may know the exact affected versions, but the table below does not list them.

ID	Vendor	Product	Action
1	Vllm-project	vllm

Total Affected Vendors: 1 | Products: 1

Solution

Update vLLM to version 0.23.1rc0 or later to fix the information disclosure.

Review tensor-processing code for similar integer truncation issues.
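When reviewing code for this class of bug, one common defensive pattern is to reject sizes that do not fit the narrower type before they reach the kernel. The sketch below shows that pattern in Python. It is not the fix shipped in 0.23.1rc0, and `check_launch_size` is a hypothetical helper introduced only for illustration.

```python
INT32_MAX = 2**31 - 1


def check_launch_size(shape) -> int:
    """Return the element count for shape, or raise if it overflows int32.

    Hypothetical guard: it assumes the downstream kernel takes the element
    count as a signed 32-bit integer.
    """
    total = 1
    for dim in shape:
        if dim < 0:
            raise ValueError(f"negative dimension: {dim}")
        total *= dim  # Python ints do not overflow, so the product is exact
    if total > INT32_MAX:
        raise ValueError(
            f"{total} elements exceed the 32-bit limit of {INT32_MAX}"
        )
    return total
```

Calling `check_launch_size((1024, 1024))` returns 1048576, while `check_launch_size((65536, 65537))` raises `ValueError` instead of letting a truncated count through. An alternative design is to widen the kernel's size parameters to 64 bits so that large tensors are processed in full rather than rejected.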


Information Disclosure
